Deep Classifier Mimicry without Data Access
Access to pre-trained models has recently emerged as a standard across
numerous machine learning domains. Unfortunately, access to the original data
the models were trained on may not be granted equally. This makes it
tremendously challenging to fine-tune or compress models, to adapt them
continually, or to perform any other kind of data-driven update. We posit,
however, that original data access may not be required. Specifically, we propose Contrastive Abductive
Knowledge Extraction (CAKE), a model-agnostic knowledge distillation procedure
that mimics deep classifiers without access to the original data. To this end,
CAKE generates pairs of noisy synthetic samples and diffuses them contrastively
toward a model's decision boundary. We empirically corroborate CAKE's
effectiveness using several benchmark datasets and various architectural
choices, paving the way for broad application.
Comment: 10 pages main, 4 figures, 2 tables, 2 pages appendix
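The abstract names the mechanism but leaves it abstract. As a rough illustration, the following is a minimal, hypothetical PyTorch sketch of the boundary-seeking idea: optimize pairs of noise inputs so that a frozen teacher disagrees on the two members while the pair stays close, driving both toward a decision boundary. The function name, the KL-based disagreement term, and all hyperparameters are illustrative assumptions, not the authors' exact procedure.

import torch
import torch.nn.functional as F

def synthesize_boundary_pairs(model, n_pairs, shape,
                              steps=200, lr=0.05, noise=0.01, lam=1.0):
    # Freeze the pre-trained classifier; only the synthetic inputs are optimized.
    model.eval()
    x_a = torch.randn(n_pairs, *shape, requires_grad=True)
    x_b = torch.randn(n_pairs, *shape, requires_grad=True)
    opt = torch.optim.Adam([x_a, x_b], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        log_pa = model(x_a).log_softmax(dim=-1)
        log_pb = model(x_b).log_softmax(dim=-1)
        # Contrastive term: push the two predictions apart.
        disagreement = F.kl_div(log_pa, log_pb.exp(), reduction="batchmean")
        # Proximity term: keep the pair close so it straddles a boundary.
        proximity = (x_a - x_b).pow(2).mean()
        loss = -disagreement + lam * proximity
        loss.backward()
        opt.step()
        with torch.no_grad():
            # Diffusion-style noise injection keeps the search stochastic.
            x_a.add_(noise * torch.randn_like(x_a))
            x_b.add_(noise * torch.randn_like(x_b))
    return x_a.detach(), x_b.detach()

Samples produced this way would then serve as a transfer set: a student is trained to match the teacher's outputs on them, with no original data involved.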
Self Expanding Neural Networks
The results of training a neural network depend heavily on the architecture
chosen, and even a modification to the network's size, however small,
typically requires restarting the training process. In contrast
to this, we begin training with a small architecture, only increase its
capacity as necessary for the problem, and avoid interfering with previous
optimization while doing so. We thereby introduce a natural-gradient-based
approach that intuitively expands both the width and depth of a neural network
when this is likely to substantially reduce the hypothetical converged training
loss. We prove an upper bound on the "rate" at which neurons are added, and a
computationally cheap lower bound on the expansion score. We illustrate the
benefits of such Self-Expanding Neural Networks in both classification and
regression problems, including those where the appropriate architecture size is
substantially uncertain a priori.
Comment: 10 pages, 4 figures
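The property claimed here, increasing capacity without interfering with previous optimization, has a standard function-preserving realization. Below is a hedged PyTorch sketch of one such widening step for a pair of adjacent linear layers: new hidden units receive zero outgoing weights, so the network's function is unchanged at the moment of expansion. The helper name widen_linear and the initialization scales are assumptions; the paper's natural-gradient expansion score, which decides when to trigger such a step, is not reproduced here.

import torch
import torch.nn as nn

def widen_linear(layer_in: nn.Linear, layer_out: nn.Linear, n_new: int):
    # Build wider replacements for two adjacent layers.
    d_in, d_hidden = layer_in.in_features, layer_in.out_features
    new_in = nn.Linear(d_in, d_hidden + n_new)
    new_out = nn.Linear(d_hidden + n_new, layer_out.out_features)
    with torch.no_grad():
        # Copy the existing weights unchanged.
        new_in.weight[:d_hidden] = layer_in.weight
        new_in.bias[:d_hidden] = layer_in.bias
        new_out.weight[:, :d_hidden] = layer_out.weight
        new_out.bias.copy_(layer_out.bias)
        # New units: small random incoming weights, zero outgoing weights,
        # so the function computed by the network is exactly preserved.
        nn.init.normal_(new_in.weight[d_hidden:], std=1e-3)
        nn.init.zeros_(new_in.bias[d_hidden:])
        nn.init.zeros_(new_out.weight[:, d_hidden:])
    return new_in, new_out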
Green-Emissive Zn2+ Complex Supported by a Macrocyclic Schiff-Base/Calix[4]arene-Ligand: Crystallographic and Spectroscopic Characterization
The macrocyclic calix[4]arene ligand H2L comprises two non-fluorescent 2,6-bis-(iminomethyl)phenolate chromophores, which exhibit chelation-enhanced fluorescence upon Zn2+ ion complexation. In the absence of external coligands, the macrocyclic [ZnL] complexes aggregate via intermolecular Zn−N bonds to give dimeric [ZnL]2 structures comprising two five-coordinate Zn2+ ions. The absorption and emission wavelengths are bathochromically shifted on going from solution (λmax,abs (CH2Cl2)=404 nm, λmax,em (CH2Cl2)=484 nm) to the solid state (λmax,abs=424 nm (4 wt%, BaSO4 pellet), λmax,em=524 nm (neat solid)). Insights into the electronic nature of the UV-vis transitions were obtained from time-dependent density functional theory (TD-DFT) calculations on a truncated model complex.
Leveraging Diffusion-Based Image Variations for Robust Training on Poisoned Data
Backdoor attacks pose a serious security threat to the training of neural
networks, as they surreptitiously introduce hidden functionality into a model. Such
backdoors remain silent during inference on clean inputs, evading detection due
to inconspicuous behavior. However, once a specific trigger pattern appears in
the input data, the backdoor activates, causing the model to execute its
concealed function. Detecting such poisoned samples within vast datasets is
virtually impossible through manual inspection. To address this challenge, we
propose a novel approach that enables model training on potentially poisoned
datasets by utilizing the power of recent diffusion models. Specifically, we
create synthetic variations of all training samples, leveraging the inherent
resilience of diffusion models to potential trigger patterns in the data. By
combining this generative approach with knowledge distillation, we produce
student models that maintain their general performance on the task while
exhibiting robust resistance to backdoor triggers.
Comment: 11 pages, 3 tables, 2 figures
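As a rough illustration of the pipeline described (re-synthesize every sample, then distill), here is a hedged PyTorch sketch. The callable make_variation stands in for an image-variation diffusion model and is an assumption, as are the temperature T and the plain KL distillation loss; the paper's concrete models and loss weighting may differ.

import torch
import torch.nn.functional as F

def distill_on_variations(teacher, student, loader, make_variation,
                          epochs=1, T=2.0, lr=1e-4):
    # The teacher may carry a backdoor from the poisoned data; the student
    # only ever sees re-synthesized inputs, in which trigger patterns
    # tend to be washed out.
    teacher.eval()
    opt = torch.optim.Adam(student.parameters(), lr=lr)
    for _ in range(epochs):
        for x, _ in loader:
            x_var = make_variation(x)        # diffusion-based variation
            with torch.no_grad():
                t_logits = teacher(x_var)    # soft labels from the teacher
            s_logits = student(x_var)
            # Standard KD loss: match the softened teacher distribution.
            loss = F.kl_div((s_logits / T).log_softmax(dim=-1),
                            (t_logits / T).softmax(dim=-1),
                            reduction="batchmean") * T * T
            opt.zero_grad()
            loss.backward()
            opt.step()
    return student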
FEATHERS: Federated Architecture and Hyperparameter Search
Deep neural architectures have a profound impact on the performance achieved in
many of today's AI tasks, yet their design still relies heavily on human prior
knowledge and experience. Neural architecture search (NAS) together with
hyperparameter optimization (HO) helps to reduce this dependence. However,
state-of-the-art NAS and HO rapidly become infeasible with increasing amounts of
data being stored in a distributed fashion, typically violating data privacy
regulations such as GDPR and CCPA. As a remedy, we introduce FEATHERS -
FEderated ArchiTecture and HypERparameter Search, a method that not only
optimizes both neural architectures and optimization-related hyperparameters
jointly in distributed data settings, but further adheres to data privacy
through the use of differential privacy (DP). We show that FEATHERS efficiently
optimizes architectural and optimization-related hyperparameters alike, while
demonstrating convergence on classification tasks at no detriment to model
performance when complying with privacy constraints.
Comment: Main paper: 8 pages, references: 2 pages, supplement: 4.5 pages; main paper: 3 figures, 2 tables, 1 algorithm; supplement: 2 figures, 4 algorithms. Extended previous version with differential privacy, theoretical results, and more experiments. Updated author list as it was incomplete.
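To make the combination concrete, the following is a minimal, assumed sketch of one server-side aggregation round: clients send gradients for shared DARTS-style architecture weights (here called alphas), and the server applies the usual clip-and-noise recipe from DP-SGD before updating. The function name, the noise calibration, and the use of DARTS-style alphas are illustrative assumptions, not FEATHERS' documented mechanism.

import torch

def dp_federated_step(alphas, client_grads, clip=1.0, sigma=0.5, lr=0.1):
    # Clip each client's gradient to bound its sensitivity.
    clipped = []
    for g in client_grads:
        scale = torch.clamp(clip / (g.norm() + 1e-12), max=1.0)
        clipped.append(g * scale)
    agg = torch.stack(clipped).mean(dim=0)
    # Gaussian noise calibrated to the clipping bound (DP-SGD style).
    agg = agg + torch.randn_like(agg) * (sigma * clip / len(client_grads))
    # Server update of the shared architecture weights.
    return alphas - lr * agg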
Monatomic phase change memory
Phase change memory has been developed into a mature technology capable of
storing information in a fast and non-volatile way, with potential for
neuromorphic computing applications. However, its future impact in electronics
depends crucially on how the materials at the core of this technology adapt to
the requirements arising from continued scaling towards higher device
densities. A common strategy to fine-tune the properties of phase change memory
materials, one that achieves reasonable thermal stability in optical data storage,
relies on mixing precise amounts of different dopants, often resulting in
quaternary or even more complicated compounds. Here we show how the simplest
material imaginable, a single element (in this case, antimony), can become a
valid alternative when confined in extremely small volumes. This compositional
simplification eliminates problems related to unwanted deviations from the
optimized stoichiometry in the switching volume, which become increasingly
pressing when devices are aggressively miniaturized. Removing compositional
optimization issues may allow one to capitalize on nanosize effects in
information storage.